One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network

نویسندگان

Vedran Vukotic

Silvia-Laura Pintea

Christian Raymond

Guillaume Gravier

Jan C. van Gemert

چکیده

There is an inherent need for autonomous cars, drones, and other robots to have a notion of how their environment behaves and to anticipate changes in the near future. In this work, we focus on anticipating future appearance given the current frame of a video. Existing work focuses on either predicting the future appearance as the next frame of a video, or predicting future motion as optical flow or motion trajectories starting from a single video frame. This work stretches the ability of CNNs (Convolutional Neural Networks) to predict an anticipation of appearance at an arbitrarily given future time, not necessarily the next video frame. We condition our predicted future appearance on a continuous time variable that allows us to anticipate future frames at a given temporal distance, directly from the input video frame. We show that CNNs can learn an intrinsic representation of typical appearance changes over time and successfully generate realistic predictions at a deliberate time difference in the near future.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Decomposing Motion and Content for Natural Video Sequence Prediction

We propose a deep neural network for the prediction of future frames in natural video sequences. To effectively handle complex evolution of pixels in videos, we propose to decompose the motion and content, two key components generating dynamics in videos. Our model is built upon the Encoder-Decoder Convolutional Neural Network and Convolutional LSTM for pixel-level prediction, which independent...

متن کامل

An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network

Cloud segmentation is a critical pre-processing step for any multi-spectral satellite image application. In particular, disaster-related applications e.g., flood monitoring or rapid damage mapping, which are highly time and data-critical, require methods that produce accurate cloud masks in a short time while being able to adapt to large variations in the target domain (induced by atmospheric c...

متن کامل

A Temporally-Aware Interpolation Network for Video Frame Inpainting

We propose the first deep learning solution to video frame inpainting, a challenging instance of the general video inpainting problem with applications in video editing, manipulation, and forensics. Our task is less ambiguous than frame interpolation and video prediction because we have access to both the temporal context and a partial glimpse of the future, allowing us to better evaluate the q...

متن کامل

Learning to Generate Long-term Future via Hierarchical Prediction

We propose a hierarchical approach for making long-term predictions of future frames. To avoid inherent compounding errors in recursive pixellevel prediction, we propose to first estimate highlevel structure in the input frames, then predict how that structure evolves in the future, and finally by observing a single frame from the past and the predicted high-level structure, we construct the fu...

متن کامل

Gaussian Prediction Based Attention for Online End-to-End Speech Recognition

Recently end-to-end speech recognition has obtained much attention. One of the popular models to achieve end-to-end speech recognition is attention based encoder-decoder model, which usually generating output sequences iteratively by attending the whole representations of the input sequences. However, predicting outputs until receiving the whole input sequence is not practical for online or low...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network

نویسندگان

چکیده

منابع مشابه

Decomposing Motion and Content for Natural Video Sequence Prediction

An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network

A Temporally-Aware Interpolation Network for Video Frame Inpainting

Learning to Generate Long-term Future via Hierarchical Prediction

Gaussian Prediction Based Attention for Online End-to-End Speech Recognition

عنوان ژورنال:

اشتراک گذاری